Monte Carlo tree search
Monte Carlo tree search - Wikipedia
To balance exploitation and exploration, Levente Kocsis and Csaba Szepesvári introduced the UCT (Upper Confidence Bound applied to Trees) algorithm.
Kocsis, L. and Szepesvári, C., 2006, September. Bandit based Monte-Carlo planning. In European Conference on Machine Learning (pp. 282-293). Springer, Berlin, Heidelberg.
I explain the Upper Confidence Bound (UCB1) algorithm in (2.2.3.2-2) UCB1 algorithm.
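As a rough illustration of how UCT trades off the two, here is a minimal sketch of the selection rule in Python. It assumes the commonly cited form of the score, mean reward plus c * sqrt(ln N / n), where N is the parent's visit count and n is the child's. The names uct_score and select_child, the dict-based child records, and the default c = sqrt(2) are my own choices for illustration, not taken from the paper.

```python
import math


def uct_score(total_reward, visits, parent_visits, c=math.sqrt(2)):
    """UCB1-style score of one child node (illustrative helper, hypothetical name)."""
    if visits == 0:
        # Unvisited children are tried first.
        return math.inf
    exploitation = total_reward / visits  # average reward so far
    exploration = c * math.sqrt(math.log(parent_visits) / visits)  # bonus for rarely visited children
    return exploitation + exploration


def select_child(children, c=math.sqrt(2)):
    """Return the index of the child with the highest score.

    Each child is a dict with "reward" (sum of rewards) and "visits" keys.
    The parent's visit count is assumed to equal the sum of the children's visits.
    """
    parent_visits = sum(child["visits"] for child in children)
    return max(
        range(len(children)),
        key=lambda i: uct_score(
            children[i]["reward"], children[i]["visits"], parent_visits, c
        ),
    )


children = [
    {"reward": 6.0, "visits": 10},  # well explored, average reward 0.6
    {"reward": 1.0, "visits": 2},   # barely explored, average reward 0.5
]
print(select_child(children))         # exploration bonus favours the second child
print(select_child(children, c=0.0))  # pure exploitation favours the first child
```

With c = 0 the rule is purely greedy; a larger c pushes the search toward children that have been visited less often.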
Related:
exploration-exploitation tradeoff: (2.2.3.1) Exploration-exploitation tradeoff